Homework review on Monday

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (http://conflicted.r-lib.org/) to force all conflicts to become errors
library(GGally)
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2
library(modelr)

red_wine <- read_csv("data/wine_quality_red.csv")
Rows: 1599 Columns: 14
── Column specification ────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): region
dbl (13): wine_id, fixed_acidity, volatile_acidity, citric_acid, residual_sugar, chlorides, ...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
white_wine <- read_csv("data/wine_quality_white.csv")
Rows: 4898 Columns: 14
── Column specification ────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr  (1): region
dbl (13): wine_id, fixed_acidity, volatile_acidity, citric_acid, residual_sugar, chlorides, ...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

set up method

try white model first

# set up test-train split
n_data_white <- nrow(white_wine)
test_prop <- 0.1

test_index_white <- sample(1:n_data_white, size = n_data_white*test_prop)
test_white <- slice(white_wine, test_index_white)
train_white <- slice(white_wine, -test_index_white)
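
The split above is random, so each run gives a different test set. A minimal sketch of making it reproducible, assuming an arbitrary seed and a stand-in data frame `toy_wine` (both made up for illustration, neither is from the original analysis):

```r
# stand-in data with the same number of rows as white_wine (hypothetical values)
toy_wine <- data.frame(wine_id = 1:4898, quality = rep(3:9, length.out = 4898))

set.seed(2023)  # arbitrary seed, an assumption; any fixed integer works
n_data <- nrow(toy_wine)
test_prop <- 0.1

# floor() makes the rounding of the non-integer test size explicit
test_index <- sample(seq_len(n_data), size = floor(n_data * test_prop))
toy_test <- toy_wine[test_index, ]
toy_train <- toy_wine[-test_index, ]

# the two sets should be disjoint and together cover every row
nrow(toy_test)
nrow(toy_train)
length(intersect(toy_test$wine_id, toy_train$wine_id))
```

With 4898 rows and a 10% test proportion this gives 489 test rows and 4409 training rows, which matches the 4407 residual degrees of freedom reported for the one-predictor model later in these notes.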

Note: any transformations we apply to our training set from now on can be recapitulated later for our test data, e.g. in our predictive model formula.
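
One way to keep train and test consistent is to wrap the transformations in a function and reuse it. A rough sketch only, using made-up numbers and a hypothetical helper `add_features()` (not part of the original notebook); the key assumption is that any statistic such as a mean is computed on the training set and then reused for the test set:

```r
# toy training and test sets (made-up numbers, not the wine data)
toy_train <- data.frame(citric_acid = c(0.20, 0.30, 0.40, 0.50))
toy_test  <- data.frame(citric_acid = c(0.10, 0.35))

# statistics are learned from the training set only
train_mean <- mean(toy_train$citric_acid)

# hypothetical helper: applies the same recipe to any data set
add_features <- function(df, centre) {
  df$citric_acid_diff_to_mean <- abs(df$citric_acid - centre)
  df
}

toy_train_fe <- add_features(toy_train, train_mean)
toy_test_fe  <- add_features(toy_test, train_mean)  # reuse the training mean
toy_test_fe
```

Reusing `train_mean` for the test rows means the test data never influences the feature definition.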

explore the data

skimr::skim(train_white) %>% view()
train_white %>% 
  select(quality, 2:7) %>% 
  ggpairs(progress = F)

Can see a slight negative correlation between fixed_acidity and quality.

Looking at chlorides, maybe the middle amount is the best: could transform to abs(diff_from_mean) or (diff_from_mean)^2 to frame it as extreme vs middle values. Could also make them into z-scores, (x - mu) / sd. Maybe a similar situation for citric_acid.
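
The three candidate transformations side by side, as a small sketch on made-up values (the vector `x` is hypothetical, not the wine data):

```r
# made-up chloride-like values, not the wine data
x <- c(0.02, 0.03, 0.04, 0.05, 0.11)

diff_from_mean <- x - mean(x)

abs_diff <- abs(diff_from_mean)     # distance from the mean
sq_diff  <- diff_from_mean^2        # squared distance, weights extremes more
z_score  <- diff_from_mean / sd(x)  # (x - mu) / sd, keeps the sign

data.frame(x, abs_diff, sq_diff, z_score)
```

Note that the z-score keeps the sign, so on its own it only rescales the variable; it would need abs() or squaring as well to express "extreme vs middle".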

train_white %>% 
  select(quality, 8:14) %>% 
  ggpairs(progress = F)

Could believe alcohol is positively correlated with quality, also knowing that it influences beer quality scores (prior domain knowledge). The other variables don’t look correlated with quality; any apparent trend may be driven by a few extreme values.

Region doesn’t look like it has any influence, not worth including.

Seeing correlations between predictors, e.g. alcohol and density, but Jamie wouldn’t chuck a predictor out because of this: could use an interaction, or could decide not to include density if alcohol is already in the model, etc. Better to keep both available so we have this choice.
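
A sketch of what those options look like as model formulas, on simulated data where density is built to be correlated with alcohol (all numbers and the `toy` data frame are made up for illustration):

```r
set.seed(1)  # arbitrary seed for the made-up data
n <- 200
alcohol <- runif(n, 8, 14)
# density falls as alcohol rises, mimicking correlated predictors
density <- 1.01 - 0.002 * alcohol + rnorm(n, sd = 0.001)
quality <- 2 + 0.3 * alcohol + rnorm(n, sd = 0.8)
toy <- data.frame(alcohol, density, quality)

cor(toy$alcohol, toy$density)  # strongly negative by construction

fit_alcohol     <- lm(quality ~ alcohol, data = toy)
fit_both        <- lm(quality ~ alcohol + density, data = toy)
# alcohol * density expands to alcohol + density + alcohol:density
fit_interaction <- lm(quality ~ alcohol * density, data = toy)

names(coef(fit_interaction))
anova(fit_alcohol, fit_both, fit_interaction)
```

The nested anova() comparison is one way to check whether density, or its interaction with alcohol, adds anything once alcohol is already in the model.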

So we have: alcohol (+ve, also correlated with density), fixed_acidity (-ve) and maybe a transformed measure of chlorides and citric_acid.

transform

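# build the feature-engineered training set used from here on
train_white_fe <- train_white %>% 
  # add a log(1 + x) copy of every numeric column as a new log_ column
  mutate(across(.cols = where(is.numeric),
                .fns = ~ log(1 + .x),
                .names = "log_{.col}")) %>% 
  # distance from the mean, for the hump-shaped variables
  mutate(chlorides_diff_to_mean = abs(chlorides - mean(chlorides)), .after = chlorides) %>% 
  mutate(citric_acid_diff_to_mean = abs(citric_acid - mean(citric_acid)), .after = citric_acid)

train_white_fe %>% 
  ggplot(aes(x = chlorides_diff_to_mean, y = quality)) +
  geom_point() +
  geom_smooth(method = lm)

train_white_fe %>% 
  ggplot(aes(x = citric_acid_diff_to_mean, y = quality)) +
  geom_point() +
  geom_smooth(method = lm)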

Slight negative correlation for chlorides, more negative correlation for citric_acid.

Not sure this is working.

Need to have a look again and put quality on the y-axis - in ggpairs, put quality as the last var not the first one:

train_white_fe %>% 
  ggplot(aes(x = chlorides, y = quality)) +
  geom_point()


train_white_fe %>% 
  ggplot(aes(x = citric_acid, y = quality)) +
  geom_point()

Now we see that chlorides does not have the hump shape after all: quality is not clearly better at middling chloride values than at the extremes. Citric acid still seems to have this shape though, so it might be worth keeping.

Don’t use the chloride transformation as we move forward. Do consider citric_acid_diff_to_mean.

Look at correlations with the new log_ vars

train_white_fe %>% 
  select(quality, starts_with("log_")) %>% 
  select(1:8) %>% 
  select(starts_with("log_"), quality) %>%  # put quality last so it's on y-axis vs others
  ggpairs(progress = F)

Maybe also log_volatile_acidity, log_chlorides (both -ve)

train_white_fe %>% 
  select(quality, starts_with("log_")) %>% 
  select(1, 9:14) %>% 
  select(starts_with("log_"), quality) %>%  # put quality last so it's on y-axis vs others
  ggpairs(progress = F)

log_alcohol is strong (but we already have alcohol). The log_density correlation looks like it might be stretched by some extreme values, so ignore it (but maybe consider an interaction with alcohol, which is still visible with the log values). log_total_sulfur_dioxide might also have this hump shape that we could transform too.

train_white_fe <- train_white_fe %>% 
  mutate(diff_log_total_sulfur_dioxide = abs(log_total_sulfur_dioxide - mean(log_total_sulfur_dioxide)), .after = log_total_sulfur_dioxide)

train_white_fe %>% 
  ggplot(aes(x = diff_log_total_sulfur_dioxide, y = quality)) + 
  geom_point() +
  geom_smooth(method = lm)

There might be something here too.

Features of interest:

  • alcohol (+ve, also correlated with density)
  • fixed_acidity (-ve)
  • citric_acid_diff_to_mean (-ve)
  • diff_log_total_sulfur_dioxide
  • log_volatile_acidity
  • log_chlorides

start modelling

forward stepwise

mod1 <- lm(quality ~ alcohol, data = train_white_fe)
summary(mod1)

Call:
lm(formula = quality ~ alcohol, data = train_white_fe)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5112 -0.5511 -0.0212  0.5450  3.2949 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.52660    0.11066   22.83   <2e-16 ***
alcohol      0.32005    0.01045   30.64   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8522 on 4407 degrees of freedom
Multiple R-squared:  0.1756,    Adjusted R-squared:  0.1754 
F-statistic: 938.6 on 1 and 4407 DF,  p-value: < 2.2e-16

# check remaining residuals
train_white_fe %>% 
  add_predictions(mod1) %>% 
  add_residuals(mod1) %>% 
  select(2:8, resid) %>% # for first 7 vars
  ggpairs(progress = F)

train_white_fe %>% 
  add_predictions(mod1) %>% 
  add_residuals(mod1) %>% 
  select(9:15, resid) %>% # for the next 7 vars (columns 9 to 15)
  ggpairs(progress = F)

train_white_fe %>% 
  add_predictions(mod1) %>% 
  add_residuals(mod1) %>% 
  select(16:23, resid) %>% # for the next 8 vars (columns 16 to 23)
  ggpairs(progress = F)

train_white_fe %>% 
  add_predictions(mod1) %>% 
  add_residuals(mod1) %>% 
  select(24:30, resid) %>% # for the next 7 vars (columns 24 to 30)
  ggpairs(progress = F)

Maybe volatile acidity (log and untransformed) is a good next one, maybe also log_sulphur_dioxide, log_total_sulfur_dioxide

add volatile acidity

mod2 <- lm(quality ~ alcohol + volatile_acidity, data = train_white_fe)

library(ggfortify)
autoplot(mod2)

The diagnostic plots look good for our assumptions of linear regression.

summary(mod2)

Adjusted R-squared is 0.2241 (vs 0.1754 for mod1), so this looks like a better model.

anova(mod1, mod2)
Analysis of Variance Table

Model 1: quality ~ alcohol
Model 2: quality ~ alcohol + volatile_acidity
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1   4407 3200.8                                  
2   4406 3011.2  1     189.6 277.42 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Yes: the F-test says mod2 fits significantly better than mod1 (p < 2.2e-16), and it reduces RSS from 3200.8 to 3011.2, so it is an improvement.

Validity of including aliases / overlapping variables

mod3 <- lm(quality ~ alcohol + volatile_acidity + log_volatile_acidity, data = train_white_fe)
summary(mod3)

Call:
lm(formula = quality ~ alcohol + volatile_acidity + log_volatile_acidity, 
    data = train_white_fe)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.5983 -0.5386 -0.0175  0.5206  3.2919 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.16962    0.13576  23.346   <2e-16 ***
alcohol               0.33098    0.01015  32.609   <2e-16 ***
volatile_acidity      1.81094    1.67203   1.083   0.2788    
log_volatile_acidity -5.19925    2.24540  -2.316   0.0206 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8263 on 4405 degrees of freedom
Multiple R-squared:  0.2254,    Adjusted R-squared:  0.2248 
F-statistic: 427.2 on 3 and 4405 DF,  p-value: < 2.2e-16

Including the log_ one makes the ordinary one non-significant (p = 0.28), and even flips the sign of its coefficient (+1.81 here vs -2.05 in mod2). Don’t want to include both of these.
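
This is what collinearity does: a variable and its log carry almost the same information, so the model cannot separate their effects and the standard errors blow up. A small simulated sketch, assuming a log(1 + x) transform and made-up data (nothing here is the wine data):

```r
set.seed(42)  # arbitrary seed for the made-up data
n <- 500
x <- runif(n, 0.1, 1.0)
log_x <- log(1 + x)
y <- 5 - 2 * x + rnorm(n, sd = 0.8)
toy <- data.frame(x, log_x, y)

cor(toy$x, toy$log_x)  # very close to 1

fit_single <- lm(y ~ x, data = toy)
fit_both   <- lm(y ~ x + log_x, data = toy)

# standard error of the x coefficient with and without its log alongside
se_single <- summary(fit_single)$coefficients["x", "Std. Error"]
se_both   <- summary(fit_both)$coefficients["x", "Std. Error"]
se_both / se_single  # how much the standard error is inflated
```

The same pattern shows up in the mod3 output above: the standard error for volatile_acidity jumps from 0.12 in mod2 to 1.67 once log_volatile_acidity is added.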

Could try mod2 but with the log_ one instead, and compare to original mod2. Overall, best to make as few models as possible, so this isn’t great (given stepwise’s flaws).

mod2b <- lm(quality ~ alcohol + log_volatile_acidity, data = train_white_fe)

summary(mod2)

Call:
lm(formula = quality ~ alcohol + volatile_acidity, data = train_white_fe)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.6027 -0.5432 -0.0098  0.5224  3.2785 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)       2.98798    0.11086   26.95   <2e-16 ***
alcohol           0.33049    0.01015   32.55   <2e-16 ***
volatile_acidity -2.05016    0.12309  -16.66   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8267 on 4406 degrees of freedom
Multiple R-squared:  0.2244,    Adjusted R-squared:  0.2241 
F-statistic: 637.4 on 2 and 4406 DF,  p-value: < 2.2e-16

summary(mod2b)

Call:
lm(formula = quality ~ alcohol + log_volatile_acidity, data = train_white_fe)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.6008 -0.5388 -0.0128  0.5219  3.2856 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.08709    0.11237   27.47   <2e-16 ***
alcohol               0.33080    0.01015   32.59   <2e-16 ***
log_volatile_acidity -2.77389    0.16522  -16.79   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8263 on 4406 degrees of freedom
Multiple R-squared:  0.2252,    Adjusted R-squared:  0.2248 
F-statistic: 640.1 on 2 and 4406 DF,  p-value: < 2.2e-16

mod2b is slightly better than mod2 (adjusted R-squared 0.2248 vs 0.2241), but not by much!

Remember also that log_ variables make the model harder to communicate, but we can find ways to translate this for our audience: “X influences Y”; “X increases Y”; “X increases Y, in that when you multiply X by 10, Y increases by 1”; or “volatile_acidity has a log-normal distribution” (if it is true that taking logs normalises the distribution).
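
One way to translate a log coefficient is to work out the predicted change in quality between two concrete values. A worked sketch using the log_volatile_acidity coefficient from summary(mod2b); the two volatile acidity values are hypothetical, and it assumes the log_ columns were built as log(1 + x):

```r
# coefficient of log_volatile_acidity reported by summary(mod2b)
b_log_va <- -2.77389

# hypothetical volatile acidity values, chosen only for illustration
va_low  <- 0.28
va_high <- 0.56  # double va_low

# assumes log_volatile_acidity = log(1 + volatile_acidity);
# with a plain log(x) only the ratio va_high / va_low would matter
change_in_quality <- b_log_va * (log(1 + va_high) - log(1 + va_low))
round(change_in_quality, 2)
```

Under these assumptions, doubling volatile acidity from 0.28 to 0.56 goes with a predicted drop of roughly half a quality point, holding alcohol fixed.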

A better approach might be to look at the correlations in ggpairs() and pick whichever version is more closely correlated with quality.

In terms of truly aliased variables (like a, b and c where a + b = c), we do not to remove aliases here as a hygiene factor before modelling.
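
For reference, a sketch of how R reports a truly aliased variable, using simulated a, b and c_total = a + b (made-up data; the names are hypothetical):

```r
set.seed(7)  # arbitrary seed for the made-up data
n <- 50
a <- rnorm(n)
b <- rnorm(n)
c_total <- a + b  # exactly determined by a and b
y <- 1 + 2 * a - b + rnorm(n)
toy <- data.frame(a, b, c_total, y)

fit <- lm(y ~ a + b + c_total, data = toy)

coef(fit)            # the coefficient for c_total comes back as NA
alias(fit)$Complete  # describes the exact linear dependency
```

lm() cannot estimate a coefficient for a perfectly dependent column, so it returns NA for it, and alias() from the stats package shows which terms it depends on.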

CV

Can do this multiple times: with untransformed vars, with log-transformed vars, and with a subset of 4 vars (e.g. from a sensible guess or from this initial stepwise exploration), then compare the resulting models to see which is best.

Do the test/train split first, so you have an unused 10% test set that you can then use to test the final models and understand predictive accuracy.

So, will hit this more later (including “hyperparameter training”) but consider:

  • test/train split first - need a leftover test set (validation set) to tell us how good our trained model was, which of our trained models to use (which hyperparameters were best - the setup of our model process)
  • do some exploration first to understand vars, even with some stepwise regression to get to which small subset of vars might be useful to try alongside all (outcome ~ .) for transformed and untransformed vars
  • use K-fold CV to fine-tune the success measures, and have a few model options to try when predicting the outcome for the test data
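
A minimal base R sketch of K-fold CV over a few candidate formulas. Everything here is an assumption for illustration: the simulated `toy` data, the hypothetical helper `cv_rmse()`, K = 5 and RMSE as the success measure are not from the original notebook:

```r
set.seed(123)  # arbitrary seed for the made-up data
n <- 300
toy <- data.frame(alcohol = runif(n, 8, 14),
                  volatile_acidity = runif(n, 0.1, 1.0))
toy$quality <- 3 + 0.3 * toy$alcohol - 2 * toy$volatile_acidity +
  rnorm(n, sd = 0.8)

# assign each row to one of k folds, once, so every model sees the same folds
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# hypothetical helper: mean out-of-fold RMSE for one model formula
cv_rmse <- function(formula, data, fold) {
  outcome <- all.vars(formula)[1]
  rmse <- sapply(sort(unique(fold)), function(i) {
    fit <- lm(formula, data = data[fold != i, ])       # train on the other folds
    pred <- predict(fit, newdata = data[fold == i, ])  # predict the held-out fold
    sqrt(mean((data[[outcome]][fold == i] - pred)^2))
  })
  mean(rmse)
}

candidates <- list(
  alcohol_only  = quality ~ alcohol,
  alcohol_va    = quality ~ alcohol + volatile_acidity,
  alcohol_logva = quality ~ alcohol + log(1 + volatile_acidity)
)
cv_results <- sapply(candidates, cv_rmse, data = toy, fold = fold)
cv_results
```

The model with the lowest cross-validated RMSE would then be refitted on the full training set and checked once against the held-out 10% test set.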

hyperparameters

Basically, anything in machine learning and deep learning that you decide their values or choose their configuration before training begins and whose values or configuration will remain the same when training ends is a hyperparameter. Here are some common examples:

  • Train-test split ratio
  • Learning rate in optimization algorithms (e.g. gradient descent)
  • Choice of optimization algorithm (e.g., gradient descent, stochastic gradient descent, or Adam optimizer)

source: https://towardsdatascience.com/parameters-and-hyperparameters-aa609601a9ac
